Context Based Word Prediction for Texting Language

Authors

  • Sachin Agarwal
  • Shilpa Arora
Abstract

The use of digital mobile phones has led to a tremendous increase in communication using SMS. On a phone keypad, multiple words are mapped to the same numeric code. We propose a Context Based Word Prediction system for SMS messaging in which context is used to predict the most appropriate word for a given code. We extend this system to allow informal words (short forms of proper English words). An informal word is mapped to its proper English words using Double Metaphone Encoding, based on their phonetic similarity. The results show a 31% improvement over traditional frequency-based word estimation.

Introduction

The growth of wireless technology has provided us with many new ways of communicating, such as SMS (Short Message Service). SMS messaging can also be used to interact with automated systems or to participate in contests. With the tremendous increase in mobile text messaging, there is a need for an efficient text input system. Because a mobile phone has a limited number of keys, multiple letters are mapped to the same number (8 keys, 2 to 9, for the 26 letters of the alphabet). This many-to-one mapping of letters to numbers gives the same numeric code for multiple words. Existing predictive text systems use frequency-based disambiguation and rank the most commonly used word above the other possible words. T-9 (Text on 9-keys), developed by Tegic Communications, is one such predictive text technology, used by LG, Siemens, Nokia, Sony Ericsson and others in their phones. iTap is a similar system developed and used by Motorola in its phones. The T-9 system predicts the word for a given numeric code based on frequency alone, which does not always give the correct result. For example, for code ‘63’, two possible words are ‘me’ and ‘of’. Based on a frequency list where ‘of’ is more likely than ‘me’, the T-9 system will always predict ‘of’ for code ‘63’. So, for a sentence like ‘Give me a box of chocolate’, the prediction would be ‘Give of a box of chocolate’.
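The ambiguity described above can be illustrated with a minimal Python sketch. The keypad layout is the standard one; the vocabulary and its word counts (`UNIGRAM_COUNTS`) and the function names are hypothetical, chosen only so that ‘of’ is more frequent than ‘me’. This is an illustration of frequency-based disambiguation in general, not T-9's actual implementation.

```python
# Standard phone keypad: digits 2-9 cover the 26 letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def to_code(word):
    """Return the numeric key sequence for a word, e.g. 'me' -> '63'."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


# Hypothetical word counts (assumption: 'of' is more frequent than 'me').
UNIGRAM_COUNTS = {
    "of": 300, "me": 100, "give": 40, "a": 500, "box": 10, "chocolate": 5,
}


def predict_by_frequency(code):
    """Pick the most frequent vocabulary word whose key sequence equals code."""
    candidates = [w for w in UNIGRAM_COUNTS if to_code(w) == code]
    return max(candidates, key=UNIGRAM_COUNTS.get) if candidates else None


def type_sentence(sentence):
    """Encode each word as key presses, then decode by frequency alone."""
    return " ".join(predict_by_frequency(to_code(w)) for w in sentence.split())


print(to_code("me"), to_code("of"))
print(type_sentence("give me a box of chocolate"))
```

Because ‘me’ and ‘of’ share the code ‘63’ and the decoder ignores the neighbouring words, every occurrence of ‘63’ is decoded the same way.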
The sentence itself carries information about the correct word for a given code. Consider the above sentence with blanks: “Give _ a box _ chocolate”. According to English grammar, ‘of’ is more likely than ‘me’ to follow the noun ‘box’, i.e. the phrase “box of” is more likely than “box me”. The proposed algorithm is an online method that uses this knowledge to predict the correct word for a given code from its preceding context.
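One simple way to use the preceding context is to score each candidate by how often it follows the previous word. The Python sketch below does this with bigram counts; it is an assumption made for illustration, since the excerpt does not specify the authors' model. The counts in `UNIGRAM_COUNTS` and `BIGRAM_COUNTS` and all function names are hypothetical.

```python
# Standard phone keypad: digits 2-9 cover the 26 letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def to_code(word):
    """Return the numeric key sequence for a word, e.g. 'of' -> '63'."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


# Hypothetical counts, invented for this example only.
UNIGRAM_COUNTS = {
    "of": 300, "me": 100, "give": 40, "a": 500, "box": 10, "chocolate": 5,
}
BIGRAM_COUNTS = {
    ("give", "me"): 20, ("me", "a"): 15, ("a", "box"): 8,
    ("box", "of"): 15, ("of", "chocolate"): 4,
}


def predict_with_context(prev_word, code):
    """Choose the candidate seen most often after prev_word.

    Ties (including unseen bigrams) fall back to plain word frequency.
    """
    candidates = [w for w in UNIGRAM_COUNTS if to_code(w) == code]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda w: (BIGRAM_COUNTS.get((prev_word, w), 0), UNIGRAM_COUNTS[w]),
    )


def decode(codes):
    """Decode key sequences left to right, feeding each choice forward."""
    words, prev = [], None
    for code in codes:
        word = predict_with_context(prev, code)
        words.append(word)
        prev = word
    return " ".join(str(w) for w in words)


codes = [to_code(w) for w in "give me a box of chocolate".split()]
print(decode(codes))
```

With these counts, the code ‘63’ is decoded as ‘me’ after ‘give’ and as ‘of’ after ‘box’, which frequency alone cannot do. Decoding left to right, one word at a time, matches the online setting in which the user has typed only the words so far.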


Similar resources

Introspective Study of Emotion Icon in Public Chat as a Gesture of Texting

An emotion icon, better known as an emoticon, is a metacommunicative pictorial representation of a facial expression that, in the absence of body language and prosody, serves to draw a receiver's attention to the tenor or temper of a sender's nominal verbal communication, changing and improving its interpretation. The present study investigates the use of these nonverbal cues in WhatsApp public cha...


Context-Based Word Prediction and Classification

This paper presents a new approach for word prediction problem. Word prediction is a natural language processing problem that tries to predict the correct word in a given context. Word completion utilities, writing aids, and language translation are among the most common applications of word prediction. In this paper, we describe a new method to predict the correct word given its context. A dat...


First Language Activation during Second Language Lexical Processing in a Sentential Context

Lexicalization patterns, the way words are mapped onto concepts, differ from one language to another. This study investigated the influence of first language (L1) lexicalization patterns on the processing of second language (L2) words in sentential contexts by both less proficient and more proficient Persian learners of English. The focus was on cases where two different senses of a polys...


Spelling-based Phonics Instruction: It’s Effect on English Reading and Spelling in an EFL Context

Systematic phonics instruction in first language education has recently received considerable research attention due to its critical role in facilitating phonological awareness and processing skills. However, little is known about the effects of systematic phonics instruction on foreign language reading and spelling in an EFL context. This study examined the effects of spelling-based phonics in...


Written word recognition by the elementary and advanced level Persian-English bilinguals

According to a basic prediction made by the Revised Hierarchical Model (RHM), at early stages of language acquisition, strong L2-L1 lexical links are formed. RHM predicts that these links weaken with increasing proficiency, although they do not disappear even at higher levels of language development. To test this prediction, two groups of highly proficie...




Publication date: 2007